18 research outputs found

Efficient Decompression of Binary Encoded Balanced Ternary Sequences

Author: Bourge Alban
Muller Olivier
Prost-Boucle Adrien
Pétrot Frédéric
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 09/04/2019

International audience. A balanced ternary digit, known as a trit, takes its values in {-1, 0, 1}. It can be encoded in binary as {11, 00, 01} for direct use in digital circuits. In this correspondence, we study the decompression of a sequence of bits into a sequence of binary encoded balanced ternary digits. We first show that it is useless in practice to compress sequences of more than 5 ternary values. We then provide two mappings, one to map 5 bits to 3 trits and one to map 8 bits to 5 trits. Both mappings were obtained by human analysis and lead to Boolean implementations that compare quite favorably with others obtained by tweaking assignment or encoding optimization tools. However, mappings that lead to better implementations may be feasible.
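The bit counts quoted in the abstract can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the decoder uses plain positional base-3 digits, which is an assumption made for illustration and not the hand-derived mapping of the paper.

```python
# Binary encoding of one balanced trit, as given in the abstract:
# -1 -> 11, 0 -> 00, 1 -> 01.
TRIT_TO_BITS = {-1: "11", 0: "00", 1: "01"}


def bits_needed(n_trits: int) -> int:
    """Smallest number of bits able to distinguish all 3**n_trits sequences."""
    return (3 ** n_trits - 1).bit_length()


def unpack(code: int, n_trits: int) -> list[int]:
    """Decode an integer into n_trits balanced trits.

    Assumption: a plain base-3 positional mapping, chosen for clarity.
    It is NOT the mapping described in the paper.
    """
    if not 0 <= code < 3 ** n_trits:
        raise ValueError("code does not name a valid trit sequence")
    trits = []
    for _ in range(n_trits):
        code, digit = divmod(code, 3)
        trits.append(digit - 1)  # base-3 digits 0, 1, 2 -> trits -1, 0, 1
    return trits


def to_binary_encoded(trits: list[int]) -> str:
    """Expand trits into the 2-bit-per-trit encoding used by the circuit."""
    return "".join(TRIT_TO_BITS[t] for t in trits)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, "trits fit in", bits_needed(n), "bits")
    print(to_binary_encoded(unpack(13, 3)))
```

Three trits have 27 combinations and fit in 5 bits (32 codes), and five trits have 243 combinations and fit in 8 bits (256 codes). That gives about 1.67 and 1.6 bits per trit respectively, against 2 bits per trit uncompressed and a theoretical floor of log2(3), roughly 1.585.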

HLS-Based Methodology for Fast Iterative Development Applied to Elliptic Curve Arithmetic

Author: Bourge Alban
Leveugle Régis
Maistri Paolo
Muller Olivier
Pontie Simon
Prost-Boucle Adrien
Rousseau Frédéric
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

International audience. High-Level Synthesis (HLS) is used by hardware developers to achieve higher abstraction in circuit descriptions. In order to shorten the hardware development time via HLS, we present an adjustment of the Iterative and Incremental Design (IID) methodology, frequently used in software development. In particular, our methodology is relevant for the development of applications with unusual complexity: the method was applied here to the development of large modular arithmetic, commonly used for cryptography applications (e.g., Elliptic Curves). Rapid feedback on circuit characteristics is used to evaluate deep architectural changes in a short time, greatly reducing the time-to-market with respect to hand-made designs. In addition, our approach is highly flexible, since the same generic high-level description can be used to produce an entire set of circuits, each with different area/performance trade-offs. Thanks to the proposed approach, any change to the initial specification (e.g., the curve used) is also very fast, while it may require a large effort in the case of hand-made designs.

Crossref

Hal - Université Grenoble Alpes

HAL Descartes

High-level synthesis for fast generation of hardware accelerators under resource constraints

Author: Prost-Boucle Adrien
Publication date: 08/01/2014

In the field of high-performance computing, FPGA circuits are very attractive for their performance and low power consumption. However, their presence is still marginal, mainly because of the limitations of current development tools. These limitations force users to have expert knowledge of numerous technical concepts. Users must also manually steer the synthesis processes in order to obtain solutions that are both fast and compliant with the hardware constraints of the targeted platforms. A novel generation methodology based on high-level synthesis is proposed in order to push back these limits. The design space exploration consists of the iterative application of transformations to an initial circuit, which progressively increases both its speed and its resource consumption. The speed of this process and its convergence under resource constraints are thus guaranteed. The exploration is also guided towards the most pertinent solutions by detecting the sections of the applications to synthesize that are most critical for the targeted execution context. This information can be refined with an execution scenario specified by the user. A demonstration tool for this methodology, AUGH, has been built. Experiments have been conducted with several applications known in the field of high-level synthesis. These applications, of very different sizes, confirm the pertinence of the proposed methodology for fast and automatic generation of complex hardware accelerators under strict resource constraints. The proposed methodology is very close to the compilation process for microprocessors, which enables its use even by users who are not experts in digital circuit design. This work constitutes significant progress towards a broader adoption of FPGAs as general-purpose hardware accelerators, in order to make computing machines both faster and more energy-efficient.

Theses.fr

Génération rapide d'accélérateurs matériels par synthèse d'architecture sous contraintes de ressources

Author: Prost-Boucle Adrien
Publication venue: HAL CCSD
Publication date: 08/01/2014

ISBN: 978-2-11-129186-7

Even if FPGA circuits are very attractive for their performance and low power consumption, their usage as hardware accelerators is still marginal. Indeed, the existing development tools are only accessible to users with expertise in circuit design. In order to push back these limits, a novel generation methodology based on high-level synthesis is proposed. By iteratively applying transformations to an initial solution, the process rapidly converges and strictly respects hardware constraints, particularly the available resources. A demonstration tool, AUGH, has been built, and experiments have been conducted with several known applications. The proposed methodology is very close to the compilation flow for microprocessors, which allows it to be used even by users with no expertise in digital circuit design.

Hal - Université Grenoble Alpes

Génération rapide d'accélérateurs matériels par synthèse d'architecture sous contraintes de ressources

Author: Prost-Boucle Adrien
Publication venue: HAL CCSD
Publication date: 08/01/2014

Thèses en Ligne

Génération rapide d'accélérateurs matériels par synthèse d'architecture sous contraintes de ressources

Author: Prost-Boucle Adrien
Publication venue: HAL CCSD
Publication date: 08/01/2014

In the field of high-performance computing, FPGA circuits are very attractive for their performance and low power consumption. However, their presence is still marginal, mainly because of the limitations of current development tools. These limitations force users to have expert knowledge of numerous technical concepts. Users must also manually steer the synthesis processes in order to obtain solutions that are both fast and compliant with the hardware constraints of the targeted platforms. A novel generation methodology based on high-level synthesis is proposed in order to push back these limits. The design space exploration consists of the iterative application of transformations to an initial circuit, which progressively increases both its speed and its resource consumption. The speed of this process and its convergence under resource constraints are thus guaranteed. The exploration is also guided towards the most pertinent solutions by detecting the sections of the applications to synthesize that are most critical for the targeted execution context. This information can be refined with an execution scenario specified by the user. A demonstration tool for this methodology, AUGH, has been built. Experiments have been conducted with several applications known in the field of high-level synthesis. These applications, of very different sizes, confirm the pertinence of the proposed methodology for fast and automatic generation of complex hardware accelerators under strict resource constraints. The proposed methodology is very close to the compilation process for microprocessors, which enables its use even by users who are not experts in digital circuit design. This work constitutes significant progress towards a broader adoption of FPGAs as general-purpose hardware accelerators, in order to make computing machines both faster and more energy-efficient.

Thèses en Ligne

Hal - Université Grenoble Alpes

HAL Descartes

Méthodologie de génération rapide et automatique d’accélérateurs matériels sous contraintes de ressources : progression itérative et gloutonne

Author: Muller Olivier
Prost-Boucle Adrien
Rousseau Frédéric
Publication venue: HAL CCSD
Publication date: 15/01/2013

National audience

Hal - Université Grenoble Alpes

HAL Descartes

Méthodologie de génération rapide et automatique d’accélérateurs matériels sous contraintes de ressources : progression itérative et gloutonne

Author: Muller Olivier
Prost-Boucle Adrien
Rousseau Frédéric
Publication venue: HAL CCSD
Publication date: 15/01/2013

National audience

HAL Descartes

High-Efficiency Convolutional Ternary Neural Networks with Custom Adder Trees and Weight Compression

Author: Bourge Alban
Prost-Boucle Adrien
Pétrot Frédéric
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

International audience. Although performing inference with artificial neural networks (ANN) was until quite recently considered essentially compute-intensive, the emergence of deep neural networks coupled with the evolution of integration technology has transformed inference into a memory-bound problem. With this established, many recent works have focused on minimizing memory accesses, either by enforcing and exploiting sparsity on weights or by using few bits for representing activations and weights, so that ANN inference can be used in embedded devices. In this work, we detail an architecture dedicated to inference using ternary {−1, 0, 1} weights and activations. This architecture is configurable at design time to provide a choice of throughput vs power trade-offs. It is also generic in the sense that it uses information drawn from the target technologies (memory geometries and cost, number of available cuts, etc.) to adapt as well as possible to the FPGA resources. This makes it possible to achieve up to 5.2k fps per Watt for classification on a VC709 board using approximately half of the resources of the FPGA.

Additional Key Words and Phrases: Ternary CNN, low power inference, hardware acceleration, FPGA

ACM Reference format: Adrien Prost-Boucle, Alban Bourge, and Frédéric Pétrot. 2018. High-Efficiency Convolutional Ternary Neural Networks with Custom Adder Trees and Weight Compression

Hal - Université Grenoble Alpes

Hal-Diderot

Fast and Standalone Design Space Exploration for High-Level Synthesis under Resource Constraints

Author: Muller Olivier
Prost-Boucle Adrien
Rousseau Frédéric
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

International audience. The very high computing capacity available in the latest Field Programmable Gate Array (FPGA) components makes it possible to extend their application fields, in High-Performance Computing (HPC) as well as in embedded applications. This paper presents a new methodology for Design Space Exploration (DSE) in the context of High-Level Synthesis (HLS) for HPC and embedded systems targeting FPGAs. This new methodology very quickly provides an RTL description of the design under resource constraints. An autonomous flow is described that performs incremental transformations of the input design description. The low complexity of the transformation evaluation, decision and exploration algorithms, associated with a greedy progression, makes this DSE methodology very fast. Moreover, this methodology respects a strict resource constraint given as bare FPGA primitive amounts. Hence, the generated design fits into the targeted FPGA or a partition of it. Such a methodology leads to autonomous, fast and transparent DSE, addressing issues known to limit the use of HLS. Results on several benchmarks highlight the capabilities of our DSE methodology. The results show a high generation time speed-up compared to one other existing HLS approach, while preserving correct performance of the generated circuits.
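The greedy, incremental progression under a strict resource budget described in this abstract can be pictured with a toy sketch. Everything here is an assumption made for illustration: the `Transformation` record, the single scalar resource count, the gain-per-resource selection rule and the example figures are hypothetical and do not reproduce the algorithms of the paper or of its tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformation:
    """Hypothetical candidate change to the circuit (illustrative only)."""
    name: str
    extra_resources: int  # additional FPGA primitives it would consume
    cycles_saved: int     # estimated latency reduction


def greedy_explore(base_resources, base_cycles, candidates, budget):
    """Apply transformations one at a time without ever exceeding the budget.

    Assumption: each step keeps the feasible candidate with the best
    cycles-saved-per-resource ratio. This is a toy criterion.
    """
    used, cycles, applied = base_resources, base_cycles, []
    remaining = list(candidates)
    while True:
        feasible = [
            t for t in remaining
            if t.cycles_saved > 0 and used + t.extra_resources <= budget
        ]
        if not feasible:
            return used, cycles, applied
        best = max(feasible, key=lambda t: t.cycles_saved / max(t.extra_resources, 1))
        used += best.extra_resources
        cycles -= best.cycles_saved
        applied.append(best.name)
        remaining.remove(best)


if __name__ == "__main__":
    candidates = [
        Transformation("unroll-loop", 40, 400),
        Transformation("pipeline", 30, 150),
        Transformation("widen-memory", 50, 300),
    ]
    print(greedy_explore(20, 1000, candidates, budget=100))
```

Because every step only adds resources and stops as soon as no candidate fits, the loop terminates and the result always respects the budget, which mirrors the convergence property the abstract claims for the methodology.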

Hal - Université Grenoble Alpes

HAL Descartes